News |
---|
Ion JS 4.2.0 and 4.2.1 Released |
Ion Java 1.8.1 Released |
Ion C 1.4.0 Released |
Ion Go 1.1.0 Released |
Ion Python 0.7.0 Released |
Amazon Ion is a richly-typed, self-describing, hierarchical data serialization format offering interchangeable binary and text representations. The text format (a superset of JSON) is easy to read and author, supporting rapid prototyping. The binary representation is efficient to store, transmit, and skip-scan parse. The rich type system provides unambiguous semantics for long-term preservation of data which can survive multiple generations of software evolution.
Ion was built to address rapid development, decoupling, and efficiency challenges faced every day while engineering large-scale, service-oriented architectures. It has been addressing these challenges within Amazon for nearly a decade, and we believe others will benefit as well.
Available Languages: C – C# – Go – Java – JavaScript – Python
Related Projects: Ion Hash – Ion Schema
Tools: Hive SerDe
Getting Started
All JSON values are valid Ion values, and while any value can be encoded in JSON (e.g., a timestamp value can be converted to a string), such approaches require extra effort, obscure the actual type of the data, and tend to be error-prone.
In contrast, Ion’s rich type system enables unambiguous semantics for data (e.g., a timestamp value can be encoded using the timestamp type). The following illustrates some of the features of the Ion type system:
- timestamp: arbitrary-precision dates and timestamps
2003-12-01T 2010-03-22T18:00:00Z 2019-05-01T18:12:53.472-08:00
- int: arbitrary size integers
0 -1 12345678901234567890...
- decimal: arbitrary precision, base-10 encoded real numbers
0. -1.2 3.141592653589793238... 6.62607015d-34
- float: 32-/64-bit IEEE-754 floating-point values
0e0 -1.2e0 6.02e23 -inf
- symbol: provides efficient encoding for frequently occurring strings
inches dollars 'high-priority' // symbols with special characters ('-' in this example) are enclosed in single quotes
- blob: binary data
{{ aGVsbG8= }}
- annotation: metadata associated with a value
dollars::100.0 height::inches::72 lotto_numbers::[7, 9, 19, 40, 42, 44]
The Specification provides an overview of the full set of Ion types.
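To make the JSON comparison above concrete, here is a minimal sketch using only the Python standard library (it does not use an Ion library). The field name `when` and the sample record are made up for illustration. It shows that a timestamp pushed through JSON comes back as a plain string, so the reader has to know out of band how to reinterpret it.

```python
import json
from datetime import datetime, timezone

# JSON has no timestamp type, so the writer converts the value to a string by hand.
event_time = datetime(2010, 3, 22, 18, 0, 0, tzinfo=timezone.utc)
encoded = json.dumps({"when": event_time.isoformat()})

decoded = json.loads(encoded)

# The reader receives an ordinary string; nothing in the data says it is a timestamp.
received_type = type(decoded["when"])

# Recovering the timestamp is an extra, convention-dependent step.
restored = datetime.fromisoformat(decoded["when"])
```

In Ion text the same value would be written directly as the timestamp 2010-03-22T18:00:00Z, and a reader would see it as a timestamp without any side agreement.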
Binary Encoding
Ion provides two encodings: human-readable text (as shown above), and a space- and read-efficient binary encoding. When binary-encoded, every Ion value is prefixed with the value’s type and length. The following illustrates a few of the efficiencies provided by Ion’s binary encoding:
- The following timestamp encoded as a JSON string requires 26 bytes: “2017-07-26T16:30:04.076Z”. This timestamp requires just 11 bytes when encoded in Ion binary:
6a 80 0f e1 87 9a 90 9e 84 c3 4c
That first byte, 6a, indicates the value is a timestamp (type 6) represented by the subsequent 10 bytes (that’s what the a represents). If this particular timestamp value is not of interest, a reader can jump over the value by skipping 10 bytes. This ability to skip over a value enables faster navigation over Ion data.
- Binary encoding of a symbol replaces the text of a symbol with an integer that can be resolved to the original text via a symbol table. This can result in substantial space savings for symbols that occur frequently!
- While blob data is base-64 encoded in text (which produces 4 bytes for every 3 bytes of the original data), a blob encoded as Ion binary is simply encoded as is—no base-64 expansion required!
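The 11-byte timestamp from the first item can be picked apart by hand. The sketch below is a simplified illustration rather than a real Ion reader: the helper names are hypothetical, it assumes the length fits in the low four bits of the first byte (values below 14), and it assumes a non-negative fraction coefficient. It splits the first byte into type and length, shows where a reader would skip to, and then decodes the timestamp fields, which the binary format stores as variable-length integers.

```python
def read_var_uint(data, pos):
    """Read an unsigned variable-length int: 7 bits per byte, high bit marks the last byte."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:
            return value, pos


def read_var_int(data, pos):
    """Read a signed variable-length int: bit 0x40 of the first byte is the sign."""
    byte = data[pos]
    pos += 1
    negative = bool(byte & 0x40)
    value = byte & 0x3F
    while not byte & 0x80:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
    return (-value if negative else value), pos


data = bytes.fromhex("6a 80 0f e1 87 9a 90 9e 84 c3 4c")

# The first byte carries the type in its high nibble and the length in its low nibble.
type_code = data[0] >> 4       # 6 means timestamp
length = data[0] & 0x0F        # number of bytes that follow

# A reader that does not care about this value can jump straight here.
next_value = 1 + length

# Otherwise, decode the fields in order.
pos = 1
offset_minutes, pos = read_var_int(data, pos)
year, pos = read_var_uint(data, pos)
month, pos = read_var_uint(data, pos)
day, pos = read_var_uint(data, pos)
hour, pos = read_var_uint(data, pos)
minute, pos = read_var_uint(data, pos)
second, pos = read_var_uint(data, pos)
frac_exponent, pos = read_var_int(data, pos)
# Remaining bytes hold the fraction coefficient (assumed non-negative here).
frac_coefficient = int.from_bytes(data[pos:next_value], "big")
```

Read this way, the fields come out as 2017, 7, 26, 16, 30, 4 with a fraction of 76 × 10⁻³ and a zero offset, which matches 2017-07-26T16:30:04.076Z.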
Similar space efficiencies are found in other aspects of Ion’s binary encoding.
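The blob overhead mentioned above is easy to check with Python's standard `base64` module. This sketch does not produce Ion binary; it only measures the base-64 text form against the raw bytes, using an arbitrary 768-byte payload chosen for illustration. (A binary-encoded blob would still carry the small type-and-length prefix described earlier.)

```python
import base64

# The blob shown in the type examples, {{ aGVsbG8= }}, holds these raw bytes.
hello = base64.b64decode("aGVsbG8=")

# An arbitrary 768-byte payload standing in for real binary data.
raw = bytes(range(256)) * 3
text_form = base64.b64encode(raw)

raw_size = len(raw)
text_size = len(text_form)
# Base-64 emits 4 output bytes for every 3 input bytes.
expansion = text_size / raw_size
```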
More Information
To learn more, check out the Docs page, or see Libs for the officially supported libraries as well as community supported tools. For information on how to contribute, how to contact the Ion Team, and answers to the frequently asked questions, see Help.